Computer Science > Computation and Language

arXiv:2304.03277 (cs)
[Submitted on 6 Apr 2023]

Title: Instruction Tuning with GPT-4

Authors: Baolin Peng, Chunyuan Li, Pengcheng He, Michel Galley, Jianfeng Gao
Abstract: Prior work has shown that finetuning large language models (LLMs) on machine-generated instruction-following data enables such models to achieve remarkable zero-shot capabilities on new tasks, with no human-written instructions needed. In this paper, we present the first attempt to use GPT-4 to generate instruction-following data for LLM finetuning. Our early experiments on instruction-tuned LLaMA models show that the 52K English and Chinese instruction-following examples generated by GPT-4 lead to superior zero-shot performance on new tasks compared to instruction-following data generated by previous state-of-the-art models. We also collect feedback and comparison data from GPT-4 to enable comprehensive evaluation and reward model training. We make the data generated using GPT-4 as well as our codebase publicly available.
Comments: 8 pages. Work in progress. Project page: this https URL
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as: arXiv:2304.03277 [cs.CL]
  (or arXiv:2304.03277v1 [cs.CL] for this version)
  https://doi.org/10.48550/arXiv.2304.03277
arXiv-issued DOI via DataCite

Submission history

From: Baolin Peng
[v1] Thu, 6 Apr 2023 17:58:09 UTC (1,397 KB)
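
The pipeline sketched in the abstract has two stages: prompt GPT-4 with existing instructions to collect instruction-following responses, then use the resulting (instruction, input, output) triples for supervised finetuning of LLaMA. Below is a minimal Python sketch of the data-generation stage only, assuming the current OpenAI SDK; the prompt format, file schema, and function names are illustrative assumptions, not the authors' released code.

import json

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def generate_answer(instruction: str, input_text: str = "") -> str:
    """Ask GPT-4 to follow one instruction (Alpaca-style prompt; format assumed)."""
    prompt = instruction if not input_text else f"{instruction}\n\nInput: {input_text}"
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


def build_dataset(instructions_path: str, output_path: str) -> None:
    """Collect GPT-4 answers into (instruction, input, output) finetuning triples."""
    with open(instructions_path) as f:
        tasks = json.load(f)  # expected schema: [{"instruction": ..., "input": ...}, ...]
    dataset = [
        {
            "instruction": t["instruction"],
            "input": t.get("input", ""),
            "output": generate_answer(t["instruction"], t.get("input", "")),
        }
        for t in tasks
    ]
    with open(output_path, "w") as f:
        json.dump(dataset, f, ensure_ascii=False, indent=2)

The same pattern would extend to the comparison data the abstract mentions: sample responses to each instruction from several models, ask GPT-4 to rate them, and use the resulting preference pairs for reward model training.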